Sensitivity analysis of disease-information coupling propagation dynamics model parameters

The disease-information coupling propagation dynamics model is widely used to study the spread of infectious diseases in society, but its parameter settings and parameter sensitivity are often overlooked, which enlarges the errors in the results. Exploring the factors that influence the model and identifying its key parameters will help us better understand the coupling mechanism and make accurate recommendations for controlling the spread of disease. In this paper, the Sobol global sensitivity analysis algorithm is applied to 6 input parameters (the cross-regional jump probability, information dissemination rate, information recovery rate, epidemic transmission rate, epidemic recovery rate, and the probability of taking preventive actions) of the disease-information coupling model, both with the same interaction radius and with heterogeneous interaction radius. The results show that: (1) In the coupling model with the same interaction radius, the parameters with the most obvious influence on the peak density of nodes in state $A_I$ and on the information dissemination scale are the information dissemination rate $\beta_I$ and the information recovery rate $\mu_I$. In the coupling model with heterogeneous interaction radius, the parameters with the most obvious influence on the peak density of nodes in state $A_I$ of the information layer are the information dissemination rate $\beta_I$ and the disease recovery rate $\mu_E$, while the parameters with a significant influence on the information dissemination scale are the information dissemination rate $\beta_I$ and the information recovery rate $\mu_I$. (2) Under both the same interaction radius and heterogeneous interaction radius, the parameters with the most obvious influence on the peak density of nodes in state $S_E$ and on the disease transmission scale of the disease layer are the disease transmission rate $\beta_E$, the disease recovery rate $\mu_E$, and the probability of an individual moving across regions $p_{\mathrm{jump}}$.


Introduction
At present, COVID-19 is still raging around the world. Information about the disease also spreads on social networking platforms, and this spread of information may inhibit or promote the spread of the disease. The two spreading processes therefore often show a coupling relationship. In our previous work [1], we explored the influence of single parameters on the model, but did not consider the influence of interactions between parameters. In order to better understand the coupling mechanism of these two transmission processes and to propose measures that can accurately control the spread of the disease, this paper conducts a sensitivity analysis of the parameters of the disease-information coupling transmission dynamics model and uses quantitative methods to identify the factors that most strongly affect the model. Sensitivity analysis is the qualitative or quantitative analysis of the effect of model inputs (including model parameters) on model outputs [2]. In general, one is interested in which parameters have the greatest impact on the output and which parameters have negligible impact [3]. Model parameter sensitivity analysis can diagnose the model structure and identify the key parameters of the model, which is a key step in model establishment and application [4]. Sensitivity analysis can be divided into local sensitivity analysis and global sensitivity analysis. Local sensitivity analysis is usually carried out by calculating partial derivatives with analytical or numerical methods, usually by perturbing one parameter at a time [2]. However, this method can only evaluate the influence of a single parameter on the model output, and cannot evaluate the influence of interactions between parameters. Although it is easy to operate, it has great limitations due to the phenomenon of "same effect with different parameters", in which different parameter combinations produce the same output. Global sensitivity analysis examines the joint influence of multiple parameters on the model output, and the interactions among parameters, over the whole parameter space. It is therefore more suitable for the analysis of complex systems.
The global sensitivity analysis method was developed on the basis of local sensitivity analysis. Compared with local sensitivity analysis, global sensitivity analysis takes into account the range and shape of the probability density function of each factor, and allows all factors to change simultaneously during calculation and analysis. Cukier et al. [5], Iman et al. [6], Archer et al. [7], Saltelli et al. [8] and other scholars have successively studied global sensitivity analysis methods. The characteristics of global sensitivity analysis are as follows: the range of factor variation can be extended to the entire domain of definition of the factor; each factor is allowed its own range of variation and all factors can vary simultaneously; and the analysis is not restricted by model type, so nonlinear, non-additive and non-monotonic models can be studied.
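The limitation of one-at-a-time perturbation can be illustrated with a minimal sketch. The toy model Y = X1 * X2, the base point and the step size below are assumptions chosen purely for illustration and are not part of the coupled model studied in this paper. Perturbing either input alone around the base point leaves the output unchanged, whereas changing both inputs together reveals the interaction.

```python
def toy_model(x1, x2):
    """Hypothetical toy model with a pure interaction term (illustration only)."""
    return x1 * x2


def one_at_a_time(model, base, delta):
    """Local analysis: perturb each input separately and record the output change."""
    y0 = model(*base)
    changes = []
    for i in range(len(base)):
        perturbed = list(base)
        perturbed[i] += delta
        changes.append(model(*perturbed) - y0)
    return changes


def joint_change(model, base, delta):
    """Change all inputs simultaneously, as a global analysis allows."""
    y0 = model(*base)
    return model(*[b + delta for b in base]) - y0


base = (0.0, 0.0)                              # assumed base point
local = one_at_a_time(toy_model, base, 0.5)    # each input alone has no effect here
joint = joint_change(toy_model, base, 0.5)     # both inputs together change the output
print(local, joint)
```

In this sketch the local changes are both zero while the joint change is 0.25, so a purely local analysis at this base point would wrongly rank both inputs as unimportant.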
The main global sensitivity analysis methods are as follows. (1) Screening method. This method is usually used for models with a large number of input factors, and its computational cost is relatively small. When there are many factors in the model, the screening method is first used to determine the factors that have a greater impact on the model output and to remove the factors that have little impact; other methods are then used for sensitivity analysis, which can greatly simplify the calculation. However, the screening method can only provide a qualitative analysis, and cannot quantify how much more important one parameter is than another. (2) Monte Carlo method. This is a numerical simulation method that constructs random variables by random sampling from the known probability distributions of the model inputs. The uncertainty of the output is then determined from the calculation results for the random variables and apportioned to the uncertain factors in the input. (3) Variance-based methods. Cukier et al. [5] originally used the Fourier Amplitude Sensitivity Test (FAST), which was later extended as in [9]. Variance-based methods calculate the sensitivity indices by decomposing the output variance into the first-order and higher-order effects of the inputs. Commonly used methods include the importance estimation method, the Fourier method, the Sobol method and so on. Because they apportion the output variance among the inputs, these methods give quantitative indices for both the main effects and the interactions.
Variance-based sensitivity analysis has made some progress and has been applied in many fields. Song et al. [10] used a variance-based sensitivity method and the GRSA method to conduct global sensitivity analysis of a headless rivet model and a ten-bar structure model. Savolainen [3] used the variance-based Sobol method to conduct a global sensitivity analysis of feedback-controlled stochastic process models, and discussed how to use global sensitivity analysis in dynamic and stochastic process modeling cases. Zhou et al. [11] introduced the sparse grid integration method into the calculation of variance-based sensitivity indices and applied it to an automobile front axle model. The practical application shows that this method inherits the advantages of sparse grid integration in integral estimation, and limits the computational cost while ensuring the accuracy of the sensitivity analysis. Fonoberova and other scholars discussed global sensitivity analysis of multi-agent models [12-18]. The Sobol method based on variance decomposition has been shown to be an appropriate sensitivity analysis method [3,19].
The dynamic model of disease-information transmission (abbreviated DMDT) is widely used, but the effect of parameter sensitivity on its results is often ignored, which leads to deviations in the results. How parameter changes affect the results of disease-information dynamics models has therefore become a problem that needs to be studied. Currently, there are relatively few sensitivity analyses of disease and information dissemination network models. This paper uses the widely used and representative Sobol method to analyze the global sensitivity of the disease-information coupling propagation dynamics model constructed in our published article [1]. At the same time, we further improve the model and propose a disease-information coupling model based on heterogeneous interaction radius. The influence of each parameter, and of the interactions between parameters, on the two models is obtained by quantitative analysis. This study complements the application of sensitivity analysis methods in the field of disease-information coupling transmission.
The rest of this article is organized as follows. Part 2 introduces the disease-information two-layer coupling model and the coupling propagation dynamics model to be analyzed. Part 3 introduces the principle and calculation of the Sobol global sensitivity analysis method. Part 4 presents the sensitivity analysis, together with the simulation results and analysis of the sensitive factors. Part 5 gives the conclusion and outlook.

DMDT parameter selection
DMDT is based on the disease-information two-layer network model, which is divided into a disease layer and an information layer, representing the physical contact network and the social network, respectively. Disease spreads in the disease layer and information spreads in the information layer. In this double-layer network, corresponding nodes in the two layers are connected by dotted lines, indicating that the two nodes are the same individual. We assume that the network structure of the information layer is static in the short term, because social relationships between people are generally stable in the short term and do not change much. The network structure of the disease layer changes dynamically, because in real life people move due to factors such as work, daily life and travel, and meet different people at different times, which causes the structure of the physical contact network to change from moment to moment. The transmission dynamics of both the disease layer and the information layer use the SIR model. Individuals in the information layer have three states. State $U_I$ means that individuals have not received epidemic-related information. State $A_I$ means that individuals have received epidemic-related information and disseminate it. State $N_I$ means that individuals have received epidemic-related information but do not pass that information to others. Individuals in the disease layer also have three states. State $S_E$ means that individuals are not sick, but can be infected by sick neighbors with a certain probability. State $I_E$ means that individuals are already sick and will infect their neighbors with a certain probability. Individuals in state $R_E$ are no longer infected and cannot infect other individuals. For details of the model, please refer to [1].
In the above model, we assume that each individual has the same interaction radius, as shown in Fig 1A. In reality, however, the individual interaction radius should be heterogeneous [20]. Considering the influence of the heterogeneity of the interaction radius on the spread of disease, we have further improved the above model. In the improved model, each individual $i$ has its own radius, denoted by $r_i$, and individual $i$ can be infected by infected individuals within the radius $r_i$, as shown in Fig 1B. For simplicity, only 12 individuals are shown, with interaction radii $r_1, r_2, \ldots, r_{12}$. Individual 5 and individual 11 can be infected by individual 4 and individual 12, respectively, but individual 2 cannot be infected by individual 1. Since the interaction radius represents the neighborhood within which a person may be infected by infected neighbors, we also call it the susceptibility radius. We assume that there are $m$ different interaction radius values in our model, which obey the distribution $P(r_j)$, $j = 1, \ldots, m$, where $P(r_j)$ is the proportion of nodes with interaction radius $r_j$. Referring to existing research [20], we set the radius values to r = [0.5, 1, 1.5] with the corresponding proportions P(r). The parameters involved in these two models, together with their meanings and value ranges, are shown in Table 1.
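The asymmetry introduced by individual radii can be sketched as follows. This is a minimal illustration under stated assumptions: the two-dimensional coordinates, the radius values assigned to the two individuals and the helper name `potential_infectors` are hypothetical, and the infection probability itself is omitted. The sketch only checks which infected individuals lie within the susceptibility radius of a given individual.

```python
import math


def potential_infectors(i, positions, radii, infected):
    """Return the infected individuals located within individual i's own radius r_i."""
    xi, yi = positions[i]
    return [
        j
        for j in sorted(infected)
        if j != i
        and math.hypot(positions[j][0] - xi, positions[j][1] - yi) <= radii[i]
    ]


# Hypothetical layout: two individuals 1.2 apart with different radii.
positions = {1: (0.0, 0.0), 2: (1.2, 0.0)}
radii = {1: 1.5, 2: 0.5}

# Individual 2 has a small radius, so an infected individual 1 is out of its reach.
print(potential_infectors(2, positions, radii, infected={1}))
# Individual 1 has a large radius, so an infected individual 2 is within its reach.
print(potential_infectors(1, positions, radii, infected={2}))
```

The same pair of individuals therefore behaves differently depending on who is infected, which is the kind of situation described for individuals 1 and 2 in Fig 1B.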
In these two models, there are four output variables to be studied: the peak density of nodes in the $A_I$ state of the information layer, denoted by $r_{A_I}$; the density of nodes in the $N_I$ state when dissemination in the information layer approaches its end, which represents the scale of information dissemination and is denoted by $r_{N_I}$; the peak density of nodes in the $S_E$ state of the disease layer, denoted by $r_{S_E}$; and the density of nodes in the $R_E$ state when the spread in the disease layer approaches its end, which represents the scale of disease transmission and is denoted by $r_{R_E}$.
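As a minimal sketch of how such outputs can be extracted from a simulated time series, the functions below compute a peak density and a final density from per-step node counts. The counts, the population size and the function names are hypothetical and are not taken from the model in [1].

```python
def density_series(counts, population):
    """Convert per-step node counts of one state into densities."""
    return [c / population for c in counts]


def peak_density(series):
    """Peak-type output: the maximum density reached during the run."""
    return max(series)


def final_density(series):
    """Scale-type output: the density at the last recorded time step."""
    return series[-1]


# Hypothetical counts over six time steps in a population of 20 nodes.
population = 20
a_counts = [1, 5, 12, 8, 3, 0]    # nodes in a spreading state
n_counts = [0, 1, 4, 10, 15, 18]  # nodes in a recovered state

peak_a = peak_density(density_series(a_counts, population))
final_n = final_density(density_series(n_counts, population))
print(peak_a, final_n)
```

The two peak-type outputs and the two scale-type outputs defined above would each be obtained in this way from the time series of the corresponding state.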

Principle of Sobol method
The Sobol method is a quantitative global sensitivity analysis method based on variance. Its basic principles are as follows. Given a square integrable function $Y = f(X_1, X_2, \ldots, X_d)$, the domain of the function is the unit hypercube:

$$\Omega^d = \{X \mid 0 \le X_i \le 1,\ i = 1, \ldots, d\}$$

The function can be written as an expansion:

$$f = f_0 + \sum_{i} f_i + \sum_{i} \sum_{j>i} f_{ij} + \cdots + f_{12 \ldots d} \qquad (2)$$

Each of these terms is also square integrable over its domain and is a function only of the factors in its subscript, for example $f_i = f_i(X_i)$ and $f_{ij} = f_{ij}(X_i, X_j)$. If each term in the above expansion has zero mean, that is,

$$\int_0^1 f_{i_1 \ldots i_s}(X_{i_1}, \ldots, X_{i_s}) \, dX_{i_w} = 0, \quad w = 1, \ldots, s,$$

then all the terms in the decomposition are pairwise orthogonal, that is,

$$\int f_{i_1 \ldots i_s} \, f_{j_1 \ldots j_l} \, dX = 0 \quad \text{for } (i_1, \ldots, i_s) \ne (j_1, \ldots, j_l).$$

Therefore, these terms can be expressed using conditional expectations of the model output $Y$:

$$f_0 = E(Y), \qquad f_i = E(Y \mid X_i) - E(Y), \qquad f_{ij} = E(Y \mid X_i, X_j) - f_i - f_j - E(Y)$$

If the conditional expectation $E(Y \mid X_i)$ changes a lot as $X_i$ varies, then the factor $X_i$ is important. Therefore, the variance of the conditional expectation can be regarded as a general measure of sensitivity, and the variances of the terms in the above decomposition are the importance measures being sought. In particular, $V(f_i(X_i)) = V[E(Y \mid X_i)]$. Dividing this by the unconditional variance $V(Y)$ gives the first-order sensitivity index:

$$S_i = \frac{V[E(Y \mid X_i)]}{V(Y)} \qquad (6)$$

It represents the main effect contribution of each input factor to the output variance. Two factors are said to interact when their effect on $Y$ cannot be expressed as the sum of their individual effects. Interaction is an important feature of the model and the key to the Sobol method.

From the expressions for $f_i$ and $f_{ij}$ above, we further obtain:

$$V(f_{ij}(X_i, X_j)) = V(E(Y \mid X_i, X_j)) - V(E(Y \mid X_i)) - V(E(Y \mid X_j))$$

where $V(E(Y \mid X_i, X_j))$ is the joint effect of $(X_i, X_j)$ on $Y$, and $V(f_{ij})$ is the joint effect of $X_i$ and $X_j$ minus the first-order effects of $X_i$ and $X_j$. $V(f_{ij})$ is called a second-order or two-way effect [21].

Writing $V(f_{ij})$ as $V_{ij}$, and so on, Eq (2) leads to the ANOVA-HDMR decomposition:

$$V(Y) = \sum_{i} V_i + \sum_{i} \sum_{j>i} V_{ij} + \cdots + V_{12 \ldots d}$$

Dividing both sides by $V(Y)$ gives:

$$\sum_{i} S_i + \sum_{i} \sum_{j>i} S_{ij} + \cdots + S_{12 \ldots d} = 1$$

For factor $X_i$, the total effect index is the total contribution of the factor to the variation of the model output, that is, the first-order effect of $X_i$ plus all the higher-order effects generated by its interactions. The first-order effect of $X_i$ is denoted by $S_i$ and the total effect of $X_i$ by $S_{Ti}$. The first-order effect is calculated by Eq (6), and the total effect is calculated by:

$$S_{Ti} = \frac{E[V(Y \mid X_{\sim i})]}{V(Y)} = 1 - \frac{V[E(Y \mid X_{\sim i})]}{V(Y)}$$

where $X_{\sim i}$ denotes all factors except $X_i$. When $S_{Ti} = 0$ or $S_{Ti} \approx 0$, $X_i$ is a non-influential factor, i.e. any value of $X_i$ within its value range will not significantly affect the model output variance $V(Y)$.
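As a worked illustration of the first-order and total effect indices, consider the toy function $Y = X_1 X_2$ with $X_1$ and $X_2$ independent and uniformly distributed on [0,1]. This function is an assumption chosen for illustration only and is unrelated to the coupled model. Its indices can be computed by hand:

```latex
\begin{align*}
E(Y) &= \tfrac{1}{2}\cdot\tfrac{1}{2} = \tfrac{1}{4}, \qquad
V(Y) = E(Y^2) - E(Y)^2 = \tfrac{1}{9} - \tfrac{1}{16} = \tfrac{7}{144}, \\
E(Y \mid X_1) &= \tfrac{1}{2} X_1, \qquad
V[E(Y \mid X_1)] = \tfrac{1}{4}\cdot\tfrac{1}{12} = \tfrac{1}{48}, \\
S_1 &= S_2 = \frac{1/48}{7/144} = \tfrac{3}{7}, \\
S_{12} &= 1 - S_1 - S_2 = \tfrac{1}{7}, \\
S_{T1} &= S_1 + S_{12} = \tfrac{4}{7}, \qquad S_{T2} = S_2 + S_{12} = \tfrac{4}{7}.
\end{align*}
```

Here each total effect index exceeds the corresponding first-order index by exactly the interaction term $S_{12}$, which is the pattern examined for the model parameters in the sensitivity analysis.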

Sobol method calculation process
The number of samples is set as N (N = 500 in this paper). The larger the number of samples, the more accurate the results [22]. The number of input variables is d (d = 6 in this paper). The general processing flow of the Sobol method is as follows [23]:

1. Sampling. The sampling method is generally based on Monte Carlo or its variants. Following the Sobol method [24,25], a Sobol' quasi-random sequence is used to generate uniformly distributed (quasi-)random numbers [22]. It is implemented with the sobolset function in Matlab, which gives an N×2d sample matrix:

$$\begin{pmatrix} x_{1,1} & \cdots & x_{1,d} & x_{1,d+1} & \cdots & x_{1,2d} \\ \vdots & & \vdots & \vdots & & \vdots \\ x_{N,1} & \cdots & x_{N,d} & x_{N,d+1} & \cdots & x_{N,2d} \end{pmatrix}$$

The first d columns form matrix A and the last d columns form matrix B.

2. Then construct the N×d matrices $AB_i$, so that the i-th column of $AB_i$ is the i-th column of matrix B and the remaining columns are those of matrix A, namely:

$$AB_i = \begin{pmatrix} x_{1,1} & \cdots & x_{1,i-1} & x_{1,d+i} & x_{1,i+1} & \cdots & x_{1,d} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ x_{N,1} & \cdots & x_{N,i-1} & x_{N,d+i} & x_{N,i+1} & \cdots & x_{N,d} \end{pmatrix}$$

where i = 1, 2, 3, ..., d. So far, a total of (d+2) matrices A, B and $AB_i$ have been constructed, giving (d+2)×N groups of input parameter samples on the interval [0,1]. Calculate the model output for all input values in the sample matrices A, B and $AB_i$ to obtain an N-dimensional output vector for each matrix.

3. Calculate the sensitivity of each parameter from these outputs, according to the first-order effect index and the total effect index defined above.
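The three steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the procedure used in this paper: it draws pseudo-random uniform numbers from the standard library instead of a Sobol' quasi-random sequence, it uses the first-order estimator of Saltelli et al. (2010) and the total effect estimator of Jansen, which may differ from the estimators applied here, and the test function Y = X1 * X2 with a dummy third input is hypothetical. For that test function the analytic values are S = 3/7 and S_T = 4/7 for the first two inputs and zero for the dummy input.

```python
import random


def sobol_indices(model, d, n, seed=0):
    """Estimate first-order and total effect indices with the A, B, AB_i scheme.

    Assumptions (not from the source): pseudo-random sampling, the Saltelli
    et al. (2010) first-order estimator and the Jansen total effect estimator.
    """
    rng = random.Random(seed)
    # Step 1: one n x 2d sample; the first d columns form A, the last d form B.
    sample = [[rng.random() for _ in range(2 * d)] for _ in range(n)]
    A = [row[:d] for row in sample]
    B = [row[d:] for row in sample]
    fA = [model(x) for x in A]
    fB = [model(x) for x in B]
    # Output variance estimated from the pooled A and B runs.
    pooled = fA + fB
    mean = sum(pooled) / len(pooled)
    var = sum((y - mean) ** 2 for y in pooled) / len(pooled)
    first, total = [], []
    for i in range(d):
        # Step 2: AB_i is a copy of A whose i-th column is taken from B.
        fAB = [model(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Step 3: variance of the conditional expectation and its complement.
        v_i = sum(yb * (yab - ya) for ya, yb, yab in zip(fA, fB, fAB)) / n
        e_i = sum((ya - yab) ** 2 for ya, yab in zip(fA, fAB)) / (2 * n)
        first.append(v_i / var)
        total.append(e_i / var)
    return first, total


def toy_model(x):
    """Hypothetical test function Y = X1 * X2; the third input is a dummy."""
    return x[0] * x[1]


S, ST = sobol_indices(toy_model, d=3, n=20000, seed=1)
print([round(s, 2) for s in S], [round(s, 2) for s in ST])
```

The sketch calls the model (d + 2) × n times. Because the estimates are computed from a finite sample, they fluctuate around the analytic values, and estimates of indices that are close to zero can come out slightly negative.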

1) Sensitivity Analysis of Coupled Models with the Same Interaction Radius
Sampling is performed in Matlab with N = 500 samples, and the sampled parameters are input into the disease-information double-layer coupling network model to obtain the outputs. We ran the model (d + 2) × N = 8 × 500 = 4000 times in total, with 50 time steps per run. The first-order and total effect indices of each parameter were then calculated in Matlab from the model outputs, giving the sensitivity indices of each input parameter for the four model output variables $r_{A_I}$, $r_{N_I}$, $r_{S_E}$ and $r_{R_E}$. The results are shown in Tables 2-5.
It can be seen from Table 2 that the most obvious factors affecting $r_{A_I}$ are the information propagation rate $\beta_I$ and the information recovery rate $\mu_I$, while the other factors have little effect on $r_{A_I}$.
It can be seen from Table 3 that the factors with the most obvious effect on $r_{N_I}$ are the information transmission rate $\beta_I$ and the information recovery rate $\mu_I$. For every variable, the total effect index, which includes interactions with the other variables, is greater than the first-order effect index. In particular, the disease transmission rate $\beta_E$ and the disease recovery rate $\mu_E$ have little impact on $r_{N_I}$ on their own: their first-order indices $S_i$ are only -0.1031 and -0.1193 (negative estimates of this kind arise from sampling error when the true index is close to zero). After interacting with other factors, however, they have a significant impact on $r_{N_I}$, with total effect indices $S_{Ti}$ of 0.5584 and 0.3592, respectively.
It can be seen from Table 4 that the most obvious factors affecting $r_{S_E}$ are the disease transmission rate $\beta_E$ and the disease recovery rate $\mu_E$. The total effect index of the probability of an individual moving across regions $p_{\mathrm{jump}}$ is greater than its first-order effect index, indicating that its interaction with other parameters has a greater impact on $r_{S_E}$, while the other factors have little effect on $r_{S_E}$.
It can be seen from Table 5 that the most obvious factors affecting $r_{R_E}$ are the disease transmission rate $\beta_E$, the disease recovery rate $\mu_E$ and the probability of an individual moving across regions $p_{\mathrm{jump}}$, while the other factors have little effect on $r_{R_E}$.
2) Sensitivity Analysis of Coupled Models with Heterogeneous Interaction Radius
Following the same procedure as for the model with the same interaction radius, we also performed a global sensitivity analysis of the improved coupling model with heterogeneous interaction radius. The results are shown in Tables 6-9.
It can be seen from Table 6 that the most obvious factors affecting $r_{A_I}$ are the information propagation rate $\beta_I$ and the disease recovery rate $\mu_E$. The total effects of the information recovery rate $\mu_I$ and the disease transmission rate $\beta_E$ on $r_{A_I}$ are significantly greater than their first-order effects, indicating that their impact on information dissemination becomes greater after interaction with other variables. In addition, a comparison with Table 2 shows that the impact of the disease on information is more obvious in the coupled model with heterogeneous interaction radius.
It can be seen from Table 7 that the factors with the most obvious effect on $r_{N_I}$ are the information transmission rate $\beta_I$ and the information recovery rate $\mu_I$. The disease transmission rate $\beta_E$ and the disease recovery rate $\mu_E$ have little effect on $r_{N_I}$ on their own: their first-order indices $S_i$ are only 0.0468 and 0.0143. After interacting with other factors, however, they have a significant impact on $r_{N_I}$, with total effect indices $S_{Ti}$ of 0.3145 and 0.1564, respectively.
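The share of the output variance attributable to interactions can be read off as the difference between the total effect index and the first-order index. The short sketch below applies this to the two pairs of values quoted above from Table 7. The dictionary keys and variable names are illustrative only.

```python
# First-order (S) and total effect (ST) indices quoted in the text for Table 7.
reported = {
    "beta_E": {"S": 0.0468, "ST": 0.3145},
    "mu_E": {"S": 0.0143, "ST": 0.1564},
}

# Interaction contribution of each parameter: total effect minus first-order effect.
interaction = {
    name: round(values["ST"] - values["S"], 4) for name, values in reported.items()
}
print(interaction)
```

For both parameters the interaction contribution (about 0.27 and 0.14) is several times larger than the first-order index, which is why they matter for the information dissemination scale mainly through their interactions.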
It can be seen from Table 8 that the most obvious factors affecting $r_{S_E}$ are the disease transmission rate $\beta_E$ and the disease recovery rate $\mu_E$. The total effect index of the probability of an individual moving across regions $p_{\mathrm{jump}}$ is greater than its first-order effect index, indicating that its interaction with other parameters has a greater impact on $r_{S_E}$, while the other factors have little effect on $r_{S_E}$.
It can be seen from Table 9 that the most obvious factors affecting $r_{R_E}$ are the disease transmission rate $\beta_E$, the disease recovery rate $\mu_E$ and the probability of an individual moving across regions $p_{\mathrm{jump}}$, while the other factors have little effect on $r_{R_E}$.

Simulation results and analysis of sensitive factors
For the most influential parameters identified by the above global sensitivity analysis, we select two values within the value range of each parameter and simulate and analyze the results. Each simulation result is the average of 20 experiments.
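Averaging over repeated stochastic runs can be sketched as follows. The function `noisy_curve` is a hypothetical stand-in for one run of a stochastic simulation and is not the coupled model; only the averaging step corresponds to the procedure described above.

```python
import random


def noisy_curve(rng, steps=5):
    """Hypothetical stand-in for one stochastic run (NOT the coupled model)."""
    return [min(1.0, max(0.0, 0.1 * t + rng.gauss(0, 0.02))) for t in range(steps)]


def average_runs(run, repeats, seed=0):
    """Average the density curves of `repeats` independent runs, step by step."""
    rng = random.Random(seed)
    curves = [run(rng) for _ in range(repeats)]
    return [sum(values) / repeats for values in zip(*curves)]


mean_curve = average_runs(noisy_curve, repeats=20)
print([round(v, 3) for v in mean_curve])
```

Averaging reduces the run-to-run noise of each point on the curve, so that differences between two parameter values are less likely to be artifacts of a single realization.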
Simulation results of the influence of different information dissemination rate $\beta_I$ values on $r_{A_I}$
The fixed parameter values were $\mu_I$ = 0.5, $\beta_E$ = 0.6, $\mu_E$ = 0.1, r = 1, v = 0.03, ω = 0.2, and $p_{\mathrm{jump}}$ = 0.01, and the values of $\beta_I$ were 0.1 and 0.7. The change over time of the density of nodes in the information dissemination state in the information layer is presented in Fig 2. When the information dissemination rate increased from 0.1 to 0.7, the peak density of nodes in state $A_I$ of the information layer increased from 28.4% to 93.7%.
Simulation results of the influence of different information dissemination rate $\beta_I$ values on $r_{N_I}$
The fixed parameter values were $\mu_I$ = 0.5, $\beta_E$ = 0.6, $\mu_E$ = 0.1, r = 1, v = 0.03, ω = 0.2, and $p_{\mathrm{jump}}$ = 0.01, and the values of $\beta_I$ were 0.1 and 0.7. The change over time of the density of nodes in the information recovery state in the information layer is presented in Fig 3. When the information dissemination rate increased from 0.1 to 0.7, the information dissemination scale increased from 94% to 99%.
Simulation results of the influence of different epidemic recovery rate $\mu_E$ values on $r_{S_E}$
The results are presented in Fig 4. When the epidemic recovery rate increased from 0.1 to 0.7, the peak density of infected nodes in the epidemic layer decreased from 41.5% to 3.3%.
Simulation results of the influence of different epidemic transmission rate $\beta_E$ values on $r_{R_E}$
We conducted this simulation by changing the epidemic transmission rate $\beta_E$ and fixing the other parameters. The fixed parameter values were $\beta_I$ = 0.1, $\mu_I$ = 0.5, $\mu_E$ = 0.1, r = 1, v = 0.03, ω = 0.2, and $p_{\mathrm{jump}}$ = 0.01, and the values of $\beta_E$ were 0.1 and 0.5. The change over time of the density of nodes in the epidemic recovery state in the epidemic layer is presented in Fig 5. When the epidemic transmission rate increased from 0.1 to 0.5, the scale of the epidemic spread increased from 28% to 64.5%.

Conclusion
Qualitative and quantitative analysis of the inputs and outputs of complex models and systems by sensitivity analysis is conducive to diagnosing the model structure, identifying the model parameters and applying the model. In this paper, the variance-based Sobol method is used to analyze the global sensitivity of the disease-information coupling dynamics model with the same interaction radius and with heterogeneous interaction radius. There are 6 model parameters: the information transmission rate, information recovery rate, disease transmission rate, disease recovery rate, the probability of moving across regions and the probability of taking preventive actions. There are four output variables: the peak density of nodes in the $A_I$ state of the information layer, the density of nodes in the $N_I$ state when dissemination in the information layer approaches its end (the scale of information dissemination), the peak density of nodes in the $S_E$ state of the epidemic layer, and the density of nodes in the $R_E$ state when the spread in the disease layer approaches its end (the scale of disease spread). The sensitivity analysis results show that: (1) The parameters with the most obvious impact on the peak density of nodes in the $A_I$ state and on the scale of information dissemination are the information dissemination rate $\beta_I$ and the information recovery rate $\mu_I$. Therefore, if the dissemination of information is to be controlled, measures need to be taken to control these two parameters. (2) The parameters with the most obvious impact on the peak density of nodes in the $S_E$ state of the disease layer and on the scale of disease transmission are the disease transmission rate $\beta_E$, the disease recovery rate $\mu_E$, and the probability of an individual moving across regions $p_{\mathrm{jump}}$. Measures need to be taken to control these three parameters in order to control the spread of the disease. (3) The parameter value range significantly affects the calculated parameter sensitivity. Our previous research results show that, for a given parameter, some value intervals have a more obvious influence on the model output, while other value intervals have a smaller influence. Studies by other scholars have confirmed this point [26].
Although this article analyzes the global sensitivity of the parameters of the disease-information double-layer coupled network model, the research still has some shortcomings. Only the Sobol method was used, and no other global sensitivity analysis methods were applied. In the next step, we will use other methods to analyze the sensitivity of the model and compare the results of the various methods. In addition, we will further divide the parameter value ranges to explore the influence of different value ranges on parameter sensitivity. Research on the control of multivariate systems can also provide a reference for our future work [27,28].

Author Contributions
Data curation: Yang Yang. Formal analysis: Yang Yang.